{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/home/archie/anaconda3/envs/lic_env/lib/python3.8/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
      "  from .autonotebook import tqdm as notebook_tqdm\n"
     ]
    }
   ],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "\n",
    "import seaborn as sns\n",
    "from dataclasses import dataclass\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "import datautils\n",
    "from utils import init_dl_program\n",
    "from utils import find_closest_train_segment\n",
    "from trep import TRep\n",
    "\n",
    "sns.set_theme()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# **T-Rep tutorial**\n",
    "\n",
    "The goal of this tutorial is to show you in depth:\n",
    "\n",
    "1. How to instantiate T-Rep with the most important parameters.\n",
    "2. How to train T-Rep.\n",
    "3. How to use a trained model to encode test data, at different granularities.\n",
    "\n",
    "This tutorial is more in-depth than the 'quick tutorial', as it aims to explain the parameters of the various functions used."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **1. Instantiate ``Args`` Configuration Class**\n",
    "\n",
    "The ``Args`` class is normally imported from the `train_eval.py` file but it is redefined here with extra comments for clarity."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "@dataclass\n",
    "class Args:\n",
    "    # MODEL PARAMETERS\n",
    "    task_weights: dict # Weights to attribute to each pretext task\n",
    "    repr_dims: int = 128 # Latent representation dimensionality\n",
    "    time_embedding: str = None # Time embedding to use ('t2v_sin', 'fully_learnable_big', 'gaussian', 'hybrid' etc.), 'None' for no time embedding. All implemented time-embeddings are defined in models.time_embeddings.py\n",
    "\n",
    "    # TRAINING PARAMETERS\n",
    "    epochs: int = 80 # Maximum number of training epochs\n",
    "    iters: int = None # Maximum number of training iterations. Can be set to 'None' if epochs is set.\n",
    "    batch_size: int = 16 # Training batch size\n",
    "    lr: float = 0.001 # Learning rate\n",
    "    seed: int = 1234 # Random seed for reproducibility\n",
    "    max_train_length: int = 800 # Maximum sequence length (depends on your GPU memory). \n",
    "                           # Longer sequences will be cut into smaller sequences.\n",
    "\n",
    "    # CONFIGURATION\n",
    "    dataset: str = \"\" # Set to \"\" if using your own dataset. If you use UCR/UEA datasets, the dataset name\n",
    "    loader: str = \"\" # Set to \"\" if using your own dataset, otherwise \"UEA\" or \"UCR\"\n",
    "    gpu: int = 0 # The gpu no. used for training and inference (defaults to 0)\n",
    "    run_name: str = \"\" # Run name to save model\n",
    "    save_every = None # Save the model checkpoint every <save_every> iterations/epochs\n",
    "    max_threads = None # The maximum allowed number of threads used by this process. Set to None if unsure.\n",
    "    eval: bool = True # Evaluate model after training if True (doesn't work for custom datasets, only UCR/UEA/ETT/Yahoo)\n",
    "    irregular: float = 0.0 # Ratio of missing data (defaults to 0). Used for testing model resilience under missing data regime\n",
    "    label_ratio: int = 1.0 # Ratio of available training labels (defaults to 1.0). Used for testing model resilience under missing labels regime\n",
    "    "
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Create an instance of arguments, specifying the necessary arguments and those important to your use case.\n",
    "- Initialise device as well as config dict"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "args = Args(\n",
    "    repr_dims=128,\n",
    "    time_embedding='t2v_sin',\n",
    "    task_weights={\n",
    "        'instance_contrast': 0.25,\n",
    "        'temporal_contrast': 0.25,\n",
    "        'tembed_jsd_pred': 0.25,\n",
    "        'tembed_cond_pred': 0.25, \n",
    "    },\n",
    "    eval=False,\n",
    "    batch_size=32,\n",
    ")\n",
    "\n",
    "device = init_dl_program(args.gpu, seed=args.seed, max_threads=args.max_threads)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **2. Load your data**"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can use any data, as long as it is an `np.ndarray` of shape $(N, T, C)$ where $N$ is the number of time-series instances, $T$ the number of timesteps per instance, and $C$ the number of channels.\n",
    "\n",
    "Here, we use a UCR dataset as an example, but you can use any dataset of yours. \n",
    "\n",
    "**N.B:** For the following cell to work, you will have to have downloaded the `UCR` datasets and placed them in the `datasets/UCR/` folder as instructed in the `README.md`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Shapes - train data: (1, 8640, 7), test data: (1, 2880, 7)\n"
     ]
    }
   ],
   "source": [
    "data, train_slice, valid_slice, test_slice, scaler, pred_lens = datautils.load_forecast_csv(\"ETTh1\")\n",
    "train_data = data[:, train_slice]\n",
    "test_data = data[:, test_slice]\n",
    "print(f\"Shapes - train data: {train_data.shape}, test data: {test_data.shape}\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **3. Create and train T-Rep**"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The model is a Pytorch Module, so you can easily save and load parameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training data shape: (10, 864, 7)\n"
     ]
    }
   ],
   "source": [
    "model = TRep(\n",
    "    input_dims=train_data.shape[-1],\n",
    "    device=device,\n",
    "    time_embedding=args.time_embedding,\n",
    "    task_weights=args.task_weights,\n",
    "    batch_size=args.batch_size,\n",
    "    lr=args.lr,\n",
    "    output_dims=args.repr_dims,\n",
    "    max_train_length=args.max_train_length\n",
    ")\n",
    "\n",
    "loss_log = model.fit(\n",
    "    train_data,\n",
    "    n_epochs=args.epochs,\n",
    "    n_iters=args.iters,\n",
    "    verbose=1,\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **4. Encode representations with the trained model**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When creating the train and test instances of a dataset, there are two methods:\n",
    "1. The most common for classification and clustering is to separate your train and test datasets by choosing different time series instances. This above means that for a dataset $X \\in \\mathbb{R}^{B \\times T \\times F}$, you define $X_{train}$ by slicing X along the instances or batch axis: `X_train = X[:n_train, :, :]`.\n",
    "2. For forecasting and anomaly detection, another way to build your train and test set is to use all instances up to timestep $T_{train}$ for training, and further timesteps for testing: `X_train = X[:, :T_train, :]`.\n",
    "\n",
    "If using the first method, please skip to section `4.1`. If using the second method, please read through section `4.0.`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### **4.0. Using correct test-set time indices**"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When splitting your train and test sets along the time axis, it is import to adequately label the timesteps corresponding to the test set: T-Rep uses timesteps to compute time-embeddings, so one shouldn't naively use timesteps $[T_{train}:T_{end}]$, or reindex the test set from timestep 0.\n",
    "\n",
    "As the model was trained on a previous section of the dataset $X_{train} = [x_{t_0}...x_{T_{train}}]$, with corresponding timesteps $[t_0...T_{train}]$, we will try to find the subsequence of $X_{train}$ which most closely resembles our test set (call that subsequence $[x_{t_a}:x_{t_b}]$, ranging from timesteps $t_a$ to $t_b$). When encoding our test set, we then feed T-Rep $X_{test}$ alongside that subsequence's timesteps $[t_a:t_b]$. This ensures we don't feed out-of-distribution inputs (timesteps) to the time-embedding module. \n",
    "\n",
    "Finding the closest segment to $X_{test}$ in the train data is very easily done using the `find_closest_train_segment` function, which uses a sliding window and the Euclidean distance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(1, 2880, 1)"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "closest_time_indices = find_closest_train_segment(\n",
    "    train_data,\n",
    "    test_data,\n",
    "    squared_dist=True\n",
    ")\n",
    "closest_time_indices.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### **4.1. Encoding representations for forecasting and anomaly detection**"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Encode representations at a timestep granularity (one representation vector per timestep), preserving the original data's temporality. This is typically what you might use for **forecasting** or **anomaly detection**."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Fine-grained (timestep-wise) representation shape: (1, 2880, 128)\n"
     ]
    }
   ],
   "source": [
    "test_repr_fine = model.encode(\n",
    "    data=test_data,\n",
    "    time_indices=closest_time_indices,\n",
    "    mask=None, # Used for the Anomaly Detection protocol, can be ignored\n",
    "    encoding_window=None, # Used to control the temporal granularity of the representation\n",
    "    causal=True, # Whether to use causal convolutions (for forecasting you might want this) or not.\n",
    "    sliding_length=1, # The length of sliding window. When this param is specified, a sliding inference would be applied on the time series.\n",
    "    sliding_padding=100, # Contextual data length used for inference every sliding windows. The timestamp t's representation vector is computed using the observations located in [t - sliding_padding, t].\n",
    "    batch_size=16,\n",
    "    return_time_embeddings=False\n",
    ")\n",
    "print(f\"Fine-grained (timestep-wise) representation shape: {test_repr_fine.shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### **4.2. Encoding representations for classification and clustering**"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Encode representations at an instance granularity (one representation vector per time-series instance), eliminating the temporal dimension of the data. This is more typically used for **classification** or **clustering**. For these tasks, we often discard the temporal dimension as we care more about **inter-instance** differences than **intra-instance** differences. In most cases, reducing each instance to one representation vector is enough, and helps reduce the intrinsic dimensionality of our problem.\n",
    "\n",
    "To encode representations at an instance granularity, simply set `encoding_window='full_series'`, which will apply a maxpool operation to the temporal dimension of the representation with a kernel size equal to the length of the time series, resulting in a temporal dimension of 1. You thus obtain one representation vector for entire time series instance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Instance-wide representation shape: (1, 128)\n"
     ]
    }
   ],
   "source": [
    "test_repr_coarse = model.encode(\n",
    "    data=test_data,\n",
    "    time_indices=closest_time_indices,\n",
    "    mask=None, # Used for the Anomaly Detection protocol, can be ignored\n",
    "    encoding_window='full_series', # Used to control the temporal granularity of the representation\n",
    "    causal=False, # Whether to use causal convolutions (for forecasting for instance) or not.\n",
    "    sliding_length=None, # The length of sliding window. When this param is specified, a sliding inference would be applied on the time series.\n",
    "    sliding_padding=0, # Contextual data length used for inference every sliding windows.\n",
    "    batch_size=16,\n",
    "    return_time_embeddings=False\n",
    ")\n",
    "print(f\"Instance-wide representation shape: {test_repr_coarse.shape}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### **4.3. Encoding representations with custom temporal granularity**"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Encode representations at a custom temporal granularity, by setting the ``encoding_window`` parameter to an integer. This integer specifies the kernel size that will be used to apply a **maxpool** operation to the timestep-level representation. This may be desirable for more advanced use cases in classification, clustering, anomaly detection, forecasting, or any other downstream tasks.\n",
    "\n",
    "This works by first computing the representation at full temporal granularity, i.e. with the same number of timesteps as the original data. A maxpool operation is then applied to the temporal dimension of the representation, with kernel size controled by the `encoding_window` parameter of the encoding function. The `stride` and `padding` are both set to `encoding_window // 2`. The representation's temporal dimensionality can be pre-determiend using the usual maxpool dimension formula."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Custom temporal resolution representation shape: (1, 115, 128)\n"
     ]
    }
   ],
   "source": [
    "test_repr_custom = model.encode(\n",
    "    data=test_data,\n",
    "    time_indices=closest_time_indices,\n",
    "    mask=None, # Used for the Anomaly Detection protocol, can be ignored\n",
    "    encoding_window=50, # Used to control the temporal granularity of the representation\n",
    "    causal=False, # Whether to use causal convolutions (for forecasting for instance) or not.\n",
    "    sliding_length=None, # The length of sliding window. When this param is specified, a sliding inference would be applied on the time series.\n",
    "    sliding_padding=0, # Contextual data length used for inference every sliding windows.\n",
    "    batch_size=16,\n",
    "    return_time_embeddings=False\n",
    ")\n",
    "print(f\"Custom temporal resolution representation shape: {test_repr_custom.shape}\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A special value can be passed to the ``encoding_window`` parameter of the encoding function: it can be set to 'multiscale'. This will concatenate representations at multiple temporal granularities, resulting in a representation which incorporates both global and local information at every timestep. Essentially this results in a larger representation dimensionality, but no temporal resolution change."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Multiscale representation shape: (1, 2880, 1024)\n"
     ]
    }
   ],
   "source": [
    "test_repr_multiscale = model.encode(\n",
    "    data=test_data,\n",
    "    time_indices=closest_time_indices,\n",
    "    mask=None, # Used for the Anomaly Detection protocol, can be ignored\n",
    "    encoding_window='multiscale', # Used to control the temporal granularity of the representation\n",
    "    causal=False, # Whether to use causal convolutions (for forecasting for instance) or not.\n",
    "    sliding_length=1, # The length of sliding window. When this param is specified, a sliding inference would be applied on the time series.\n",
    "    sliding_padding=100, # Contextual data length used for inference every sliding windows.\n",
    "    batch_size=16,\n",
    "    return_time_embeddings=False\n",
    ")\n",
    "print(f\"Multiscale representation shape: {test_repr_multiscale.shape}\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### **That's it!**\n",
    "\n",
    "This is all you need to know to use T-Rep. The produced `np.ndarray` of representations can then be used as inputs for any task ranging from classification, clustering, forecasting, to anomaly detection etc."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "ts2vec_env",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.16"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
